DOCUMENT RESUME

AUTHOR          Stewart, Douglas K.
TITLE           Evaluation for Criminal Justice Agencies:
                Problem-Oriented Discussion.
INSTITUTION     National Inst. of Law Enforcement and Criminal
                Justice (Dept. of Justice/LEAA), Washington, D.C.
PUB DATE        Sep 78
AVAILABLE FROM  Superintendent of Documents, U.S. Government
                Printing Office, Washington, D.C. 20402 (Stock No.
                027-000-007 10-H)

EDRS PRICE      MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS     *Evaluation Methods; Institutional Research; Needs
                Assessment; *Organizational Effectiveness; *Policy
                Formation; *Program Evaluation; Research Methodology;
                Research Problems; State Agencies
IDENTIFIERS     *Criminal Justice Agencies



ABSTRACT

     This report discusses considerations involved in placing the
evaluation process for criminal justice agencies within an
organizational and practical context. The discussion proceeds from
the following perspectives: (1) program evaluation is a
policy/management tool; (2) various levels of policy and management
personnel have numerous and varied evaluation information needs; and
(3) rarely is an evaluation so fatally flawed as to be without some
relevance to policy. The report identifies potential problems in the
conduct of program evaluation so that they can be anticipated,
assessed and prevented. Pitfalls in interpreting data for alternative
policy purposes are examined. Concerns to be addressed before data
collection begins are analyzed to minimize impediments to a
successful evaluation. It is noted that during the data acquisition
and data analysis stages, certain interpretational problems must be
considered--including potential difficulties of transferring
programs to new environments or of expanding programs. The final
stage of the evaluation cycle is discussed in terms of converting
problems into products. The report includes a bibliography, and
technical discussions of variables, correlation, and experiments
appear in the appendices. (Author)



Reproductions supplied by EDRS are the best that can be made from the
original document.




National Institute of Law Enforcement and Criminal Justice
Law Enforcement Assistance Administration
U.S. Department of Justice







Evaluation for
Criminal Justice Agencies:
Problem-Oriented Discussion

by
Douglas K. Stewart

September 1978







For sale by the Superintendent of Documents, U.S. Government Printing Office



National Institute of Law Enforcement
and Criminal Justice
Blair G. Ewing, Acting Director

Law Enforcement Assistance Administration
James M. Gregg, Acting Administrator



This project was supported by Contract Number 6-0S43-J-LE^, awarded by
the National Institute of Law Enforcement and Criminal Justice, Law
Enforcement Assistance Administration, U.S. Department of Justice, under
the Omnibus Crime Control and Safe Streets Act of 1968, as amended.
Points of view or opinions stated in this document are those of the
authors and do not necessarily represent the official position or
policies of the U.S. Department of Justice.



TABLE OF CONTENTS

CHAPTER I:    INTRODUCTION
              A.  The Roles of Evaluation
              B.  Adaptive Evaluation
              C.  Evaluation Credibility and Acceptability

CHAPTER II:   RESEARCH PROBLEMS AND UNMET ASSUMPTIONS
              A.  Programs Exist as Legislated or Planned
              B.  Programs and Program Environments are Stable
                  over Time
              C.  Conventional Criteria or Goals are Appropriate
                  to Project and Program Evaluation

CHAPTER III:  INTERPRETIVE PITFALLS
              A.  "Anything will Work" for a Great Leader
              B.  Cross "Cultural" Transfer Can be Problematic
              C.  Operating Environments Change
              D.  If a Little Bit Works...
              E.  Individuals are not Groups: The Ecological
                  Fallacy
              F.  Constrained Populations: Selective Recruitment
                  and Differential Attrition
              G.  Absence of Total Rigor Totally Invalidates the
                  Results of a Study

CHAPTER IV:   CAPITALIZING ON ADVERSITY
              A.  Deviation of Programs from Legislative or
                  Planning Specifications are Valuable for Policy
                  and Planning Audiences
              B.  That Programs are Dynamic may Indicate Modes of
                  Institutional Learning which can be Transferred
              C.  Attempts to Conceptualize and Operationalize
                  "Appropriate" Objectives and Goals can Impact
                  Planning and Legislative Language and Procedures
              D.  Failure to Find a "Pseudo Control Group" is
                  Itself a Finding, Perhaps Relative to Recruitment
                  Efficiency

NOTES

SELECTED BIBLIOGRAPHY

APPENDIX A:   DISCUSSION: VARIABLES

APPENDIX B:   DISCUSSION: CORRELATION

APPENDIX C:   DISCUSSION: EXPERIMENTS


ABSTRACT

     This report discusses considerations involved in placing the
evaluation process within an organizational and practical context.
The discussion proceeds from the following perspectives: Program
evaluation is a policy/management tool. Various levels of policy
and management personnel have numerous and varied evaluation
information needs. Rarely is an evaluation so fatally flawed as
to be without some relevance to policy.

     The report identifies potential problems in the conduct of
program evaluation so that they can be anticipated, assessed and
pre-empted. Pitfalls in interpreting data for alternative policy
purposes are examined. Concerns to be addressed before data
collection begins are analyzed to minimize impediments to a successful
evaluation. During the data acquisition and data analysis stages,
certain interpretational problems must be considered--including
potential difficulties of transferring programs to new environments
or of expanding programs. The final stage of the evaluation cycle
is discussed in terms of converting problems into products.

     The report includes a bibliography, and technical discussions
of variables, correlation, and experiments appear in the appendices.






CHAPTER I
INTRODUCTION

     The mandate, if not demand, for quantitative program
evaluations is widespread within the criminal justice area. Such
evaluations operate in diverse environments, and serve varied
audiences. Differences surrounding program evaluations may
cause the practitioner to lose hope of discovering any common
principles to guide the design and conduct of evaluation work.

     The purpose of this document is to place the evaluation
process within the context of organizational purposes and
practical constraints. More specifically, we are concerned
that the evaluator and the evaluator's audiences appreciate
certain problems and pitfalls which may be encountered. None
of these problem areas has been concocted. Each has been
encountered by the author during the conduct of one or more
criminal justice evaluations.

     The specific concerns of this report can be briefly
summarized as follows:

     •  program evaluation is a policy/management tool;

     •  numerous and varied evaluation information needs
        exist across various levels of policy and management
        personnel;

     •  an evaluation need seldom be so fatally flawed
        as to be of no policy relevance;

     •  several potential problem areas in the conduct
        of program evaluations can be anticipated,
        assessed and--hopefully--preempted;

     •  pitfalls in the interpretation of data for
        alternative policy purposes can be identified and
        their ramifications appreciated.

     The first three points supply the orientation for this
discussion while the latter two describe the substantive
content of the report.



     Following further discussion of the role and context of
program evaluation in the remainder of this chapter, three
substantive chapters are presented:

     •  In Chapter II we discuss impediments to a
        successful evaluation in terms of design
        problems and unmet assumptions--concerns to
        be addressed prior to data collection.

     •  Chapter III considers certain interpretational
        problems including potential problems of program
        transfer to new environments or simply program
        expansion--considerations appropriate both
        during data acquisition and data analysis.

     •  Finally, Chapter IV discusses the conversion
        of problems into products--in some sense, the
        final stage in the evaluation cycle.

     Additionally, several basic technical appendices are
included for use by the general reader.



A. The Roles of Evaluation

     Evaluations of social programs are often thought to be akin
to the award of academic grades to school students--a means by
which to identify those who are "better" and to distinguish them
from those who are not. That is, program evaluation is thought
to be a tool by which programs may be designated as "doing
fine," suited for "repair" or "the junk yard." Thus, a manual
entitled "Quick Evaluation Methodology" suggests that "Quick
evaluations were designed to be of use to decision-makers facing
the following problems:

     •  whether to continue funding a particular treatment
        program, and, if so, at what level;

     •  whether technical assistance should be provided to
        a particular program, and, if so, what type; and

     •  if an entire city's programs are analyzed, whether
        funding of a proposed new program appears warranted."

The author goes on, however, in a discussion of potential
limitations of quick evaluations, to note:

          ... a quick evaluation does not address the
          question of whether a community needs that
          particular treatment program; a quick
          evaluation only assesses the performance of
          that program. The implications of not funding
          a mediocre methadone maintenance program are
          quite different for a community where that is
          the only such program than for a community
          where there are several others.

It should also be noted, by way of elaborating the above comment,
that such an evaluation will be unlikely to touch on constitutional
and other issues surrounding drug treatment in general,
and methadone treatment more specifically.

     Daniel Glaser, an experienced evaluator of correctional
systems, stated:

          Before further discussion, it will be
          acknowledged that often the most effective way to
          reduce the extent to which people are labeled
          deviant is not to change their behavior but
          to change the labeling practices so that they
          are no longer considered deviant. For example,
          instead of trying to change people so they
          will cease the moderate use of marihuana, we
          can cease regarding this practice as warranting
          their being changed ...

Of course, determining what is to be defined as deviant sounds
very close to policy formulation and that is our point:

          Social program evaluation research is
          undertaken as a basis for settling questions of
          policy.

Thus, we should not restrict ourselves as evaluators to awarding
the educator's version of an academic grade, for:

          We must learn to look at our objectives as
          critically and as professionally as we look
          at our models and our other inputs.

     The program evaluator too often focuses on the lower level
questions (the equivalent of grade assignment) without
recognizing the potential for supplying expert information of the
higher order variety--frequently due to limitations which are
self imposed by the evaluator, but often because he is forced to
do so by the sponsor. Our point is not that the award of a grade
for performance efficacy, impact, etc. is without merit, but
rather that the evaluator should be aware of the breadth of
opportunity available for program evaluations.



B. Adaptive Evaluation

     The above noted range of potential topics for an evaluation
study and the uncertainty confronting any but the most trivial of
studies suggest the adaptive and evolutionary natures of most of
the best studies. An evaluation study may well start out to ask
the question: "how did it work in a given locale?" This is indeed
appropriate for the very immediate questions raised above:

     •  should funding be terminated?

     •  should technical assistance be made available?

On the other hand, various policy makers may be concerned with
questions of the following variety:

     •  does the program operate as anticipated?

     •  are there unanticipated consequences of the
        program which either dilute or amplify the
        program's net benefits?

     •  what are the implications of expanding the
        level of operations of the program?

          .  impacts on (or costs to) other
             programs;

          .  resource availability (e.g., staff)
             and need level.

The point is that most evaluation studies will not by design
anticipate all the possible questions of the above sort. A flexible
and adaptive design, however, should be prepared to attend to the
unanticipated, analyze it and determine appropriate policy
makers who should be interested in such special purpose reports
as may be produced. All of which is to suggest that no cook books
exist for this style of policy related research and that, indeed,
one may be wise to be willing to entertain intuitions and insights
as well as "hard" data. This topic will be further addressed
subsequently; however, our emphasis here is that no discussion can
give you a set of procedures which will guarantee that you are
able to extract the maximum significant policy-relevant
information, given your evaluation funds. Thus, the purpose of the
current document is to provide a certain orientation to the
extremely complicated process and hope that, like a cat, you can
land on your feet.



C. Evaluation Credibility and Acceptability

     It was George Bernard Shaw who noted that "every profession
is a conspiracy against the laity," and we would do well to keep
those words in mind when considering the problems we may
confront in "marketing" findings and conclusions to policy
makers, the "laity." There is a sense in which evaluation
interests run counter to organizational interests. In a series of
questions, Aaron Wildavsky makes the point:

     •  who will evaluate and who will administer?

     •  how will power be divided among these
        functionaries?

     •  which ones will bear the cost of change?

     •  can evaluators create sufficient stability
        to carry on their own work in the midst of
        a turbulent environment?

     •  can authority be allocated to evaluators
        and blame apportioned among administrators?

     •  how to convince administrators to collect
        information that might help others but can
        only harm them?

     •  how can support be obtained on behalf of
        recommendations that anger sponsors?

     •  can knowledge and power be joined?

And as a summary response to the above evaluators' questions, the
following is offered:

          Pure evaluative man, however single-minded his
          concentration on the intrinsic merits of
          programs, must also consider their interaction
          effects on his future ability to pursue his
          craft. Just as he would insist on including
          the impact of one element in a system on
          another in his policy analysis, so must he
          consider how his present recommendations
          affect future ones. A proper evaluation
          includes the impact of a policy on the
          organization responsible for it.

     The point, in the simplistic abstract, is that an evaluation
needs to be designed with the reporting context understood. Thus
we return to the tension between the evaluator who would assign
an academic grade only to a project, and the evaluator who would
redesign the cosmos. It is not a simple matter, but we desire
that more try to be both the craftsmen of the former sort as well
as being sensitive to unexpected information and willing to move
in the "visionary" direction in order to better serve varying
policy makers.

     As data are gathered and experience broadened in the conduct
of an evaluation, insights and intuitions are likely to be
generated which were absent in the original design process. To
the degree possible, data collection can reflect these new
perspectives, whether by amending data acquisition instruments
or by developing new procedures. Moreover, analyses may be
modified and expanded in an attempt to pursue the post-design
insights. What, in general, is not terribly credible or
acceptable is an unsubstantiated intuition, although at the end of a
high quality presentation, the evaluator may wish to deliver
"hunches" which could inform future research.






CHAPTER II
RESEARCH PROBLEMS AND UNMET ASSUMPTIONS

     Within this section, discussion will be focused on
preparing the evaluation design for the environment within which
it will operate (and, by implication, suggest those facets of
the environment which need to be changed in the interest of
evaluation). We use the term "environment" because we see
the evaluator standing between operating programs and policy
makers. The latter are the clients and the former constitute
the objects of evaluation. Put another way, the policy makers
constitute the "demand side" and the operating programs
constitute the "supply side." Given the flexible and adaptive
stance advocated here, it is important for the evaluator to
recognize his role, coupling these two constituencies.

     The primary determination to be made by the evaluator in
this context is the identity of the various consumers of the
proposed evaluation study. Following this initial step, the
evaluator can begin to fill out a "Requirements Checklist"
which could look something like the following:



                    REQUIREMENTS CHECKLIST

     1.  Who are the consumers of the evaluation?

     2.  Specifically what do they need to know?

     3.  What is already known?

     4.  a.  Is a "true" (i.e., randomly assigned)
             control group required?

         b.  Or is a "reasonable" contrast group
             sufficient?

         c.  Or no such things?

         d.  What sample sizes are necessary for
             the required level of precision?

     5.  Is anything approaching an adequate design
         capable of implementation, given time and
         other constraints?

     6.  Overall, do you think the proposed study
         should be undertaken--that is, can it
         possibly (likely?) yield information of
         value (for whom)?
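
     The sample size question in the checklist can be given a rough
first answer before any data are collected. The sketch below is not
taken from this report; it assumes a simple random sample, a
proportion-type outcome (for instance, a rearrest rate), and the
usual normal approximation, n = z^2 p(1 - p) / e^2. The function
name and its default values are illustrative assumptions.

```python
import math


def sample_size_for_proportion(margin, p=0.5, z=1.96):
    """Approximate sample size needed to estimate a proportion.

    margin : desired half-width of the confidence interval (e.g. 0.05)
    p      : anticipated proportion; 0.5 is the most conservative guess
    z      : normal quantile; 1.96 corresponds to roughly 95% confidence

    Assumes a simple random sample and the normal approximation.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


# A tighter margin requires a considerably larger sample.
n_tight = sample_size_for_proportion(0.05)
n_loose = sample_size_for_proportion(0.10)
```

     Halving the acceptable margin roughly quadruples the required
sample, which is one reason the question belongs on the checklist
before a design is fixed.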

     It should be clear that this checklist will not be completed
at one sitting; rather, it is one means of tracking the evolving
design process. A second checklist may be termed the "Assumptions
Checklist" and could take the following form:

                    ASSUMPTIONS CHECKLIST

     1.  Program elements assumed.

     2.  Activities assumed.

     3.  Documentation assumed.

     4.  Objectives assumed.

     5.  Control/contrast groups assumed.

     6.  Sample sizes, etc. assumed.

     7.  Other data elements assumed.

     8.  Cooperation and access assumed.

     The Assumptions Checklist is, clearly, a typification of the
evaluation design in terms of the research situation which is
anticipated. It is the importance of clarifying what is expected
by a design which serves as our central theme. Much grief and
many false starts can be avoided if these assumptions are checked
before the evaluation design is implemented. This is not to say
that all uncertainty can ever be removed from the evaluation
process, but certain precautionary measures can be very productive.

     A quick determination of the plausibility of the various
assumptions can be made through relatively simple use of various
information sources:

     •  interviews with program staff

     •  search of program files

     •  analysis of program budget(s)

     •  acquisition of external documents

     •  interviews with appropriate others

     In the following pages we will discuss the individual
assumptions, problems which can arise if they are not met, how the
above sources can be used to determine the plausibility of the
assumptions and how to modify designs to cope with unmet assumptions.

     Several of the assumptions to be discussed share in common
their derivation from enabling legislation and other descriptions
of what the program "should be" and what people "should do." ("If
people always did what they were supposed to do, the Army wouldn't
need sergeants." -- Anonymous.)



A. Programs Exist as Legislated or Planned

     A program, as described, is often composed of numerous
elements. A court diversion program might be described as including
medical, vocational, legal, psychological and transportation
components. With this in mind, an elegant evaluation design might
be constructed which would assess the integration of the several
components. The reality of the situation could turn out to be
one in which there are several harried case workers whose duties
are largely undifferentiated. In this event, the effort which
went into the evaluation design would be largely wasted as the
design is unsuitable to the reality. Before undertaking the
design work, interviews with program staff and inspections of
files and budgets could have turned up the facts of the matter.
Research questions which emerge in reference to the newly
determined situation include:



     •  how are case assignments made?




     •  how is case management quality overseen?

     •  how is continuity assured in the face of staff
        turnover?

     The problem of assumed activities may be viewed as the
process side of the assumed elements problem just discussed. To
continue the example of the court diversion program, the
evaluation design may have assumed a relatively complete and
sophisticated intake "work-up" involving psychological profile, medical
history, work and criminal histories, etc. If this activity is
not undertaken by a program, the evaluator may encounter
difficulties because certain analytic uses had been intended for such
data elements. (This overlaps with the following discussion
concerning documentation assumed.) For example, the design may
have anticipated using intake data to "match" program participants
with non-participants, adjusting "outcome" in terms of background,
etc. In short, the design can be in real difficulty. Again,
interviews with program staff, examination of files and budgets






may assist the evaluator in determining what activities are going
on prior to creation of the final design.

     If the missing data elements are considered crucial to a
successful evaluation, the appropriate intake procedures may need
to be implemented--at least on a sample basis. Otherwise, a more
modest and less powerful evaluation design may be undertaken and
surrogate data elements sought. On the other hand, "missing
activities" can be of far more substance than a "missing data"
problem. For example, let us assume that the mythical court
diversion program was intended to emphasize special services for
female clients (such as child care facilities). A major focus of
the evaluation design might well emphasize this program versus
other diversion programs without such female oriented activities.
If in fact the female oriented activities are absent or absurd
(the child care facility is a rat infested room with no security)
then the design is largely inappropriate and certainly
inefficient--the resources to be expended in contrasting female
oriented with non-female oriented programs are productive of very
little. The study design should thus drop this facet while
perhaps adding a concern with the possible effects of unmet
expectations on the part of female clients.

     Most evaluation designs anticipate the existence of certain
groups of observations for use in developing comparisons or
contrasts. At minimum is the assumption that somebody or something
has "received the treatment." To expand, a design typically
assumes some number of entities have participated in the program
to be evaluated. Moreover, an evaluation will typically anticipate
developing contrasts or comparisons between participants and
non-participants. As mentioned previously, the experimental ideal
requires that participation and non-participation be randomly
determined. Lacking random assignment by the investigator,
second best conditions obtain where "nature" appears to have
acted capriciously in assignment to participation vs.
non-participation groups (i.e., without bias). An example of an
intended use of capriciousness in the environment appears in an
unpublished Federal Bureau of Prisons document in which the
research design had initially

          planned on selecting, for the comparison group,
          federal offenders released directly to the
          community (rather than through a Community
          Treatment Center--C.T.C.), who are eligible and
          have need for C.T.C. placement, but for some
          reason (perhaps lack of bed space in the area
          of release) were not referred to a C.T.C.
          (Emphasis added.)

This arrangement would be close to ideal, although the wise
skeptic would inquire into the nature of the referral process.
Where this situation appears to hold, it is important to make
certain minimal checks on the similarities of the two groups.
The important point in dealing with anything other than randomly
formed groups is that one must guard against confusing a
previously existing difference between groups with a post
intervention or treatment effect.
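
     One way to make such "minimal checks" concrete is to compare the
two groups on characteristics measured before the program began. The
sketch below is an illustration and not a procedure prescribed in this
report. It computes a standardized mean difference for each background
variable and lists those that exceed a cutoff. The variable names are
hypothetical, and the 0.1 cutoff is a common rule of thumb adopted
here as an assumption.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(participants, comparison):
    """Difference in group means, in pooled standard deviation units.

    Each argument is a list of pre-program values (at least two per
    group) for one background variable, such as age or prior arrests.
    """
    pooled_sd = sqrt((variance(participants) + variance(comparison)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (mean(participants) - mean(comparison)) / pooled_sd


def flag_imbalances(background, threshold=0.1):
    """Return names of variables on which the groups differ noticeably.

    background maps a variable name to a (participants, comparison)
    pair of lists. The threshold is an assumed rule of thumb.
    """
    return [
        name
        for name, (participants, comparison) in background.items()
        if abs(standardized_difference(participants, comparison)) > threshold
    ]
```

     A variable flagged here marks a pre-existing difference between
the groups; any later difference in outcomes cannot be read as a
treatment effect until that variable has been taken into account.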

     The point of this discussion is simply that if a design is
dependent on some sort of treatment and comparison groups, their
existence should be confirmed prior to implementation of the
design. If the existence of such groups cannot be confirmed, then
the design needs to be modified (e.g., a quasi-experimental
design) or the objective of conducting a quantitative evaluation
dropped entirely.

     Many evaluation designs assume the existence of various
kinds of documentation regarding program activities, participants,
etc. Quite frequently, descriptions of information systems are
very widely off the mark and those which do exist may, in fact,
only be accessible with much manual effort. For example, the
evaluation of the court diversion program may have anticipated
sampling among program "graduates" based on a list of such
persons. That list may not exist. Instead, the evaluator may be
confronted by a list of program intakes and dates terminated with
no recorded information regarding reason for termination. Or,
non-comparable lists may be maintained by different programs
(e.g., the definitions of "graduate" may differ--one program may
call an "intake" someone with one contact with the program
whereas another does not "log" a person as an intake until after three
months of participation, etc.). The problem of documentation is
probably one of the most troublesome confronting the evaluator
within the criminal justice system (probably the equal of "missing
groups" to be discussed). Whether the assumed existence of
documentation is based on legislation and regulation or "common sense,"
it should never be permitted to guide an evaluation design.
Indeed, a program director's word that certain data elements
exist is insufficient. The evaluator should undertake a
simulation of the intended procedures and receive very specific
definitions of terms used, etc. Even when record keeping is not
sloppy, it should be emphasized that administrative record systems
are not constructed and maintained with the evaluator in mind.

     It is our contention that the weakness of many data systems
is the reason for many interview or survey type studies. In the
case of our evaluation of the court diversion project, assume we
had been able to obtain a list of "graduates" (who are similarly
defined across programs) and then desired to search some criminal
justice information system relative to future arrests, convictions
and parole revocations. This information system may require
birth date and race in addition to name in order to screen for
duplicate names (i.e., more than one person with identical names),
data elements which may not be available from the program's files.
Moreover, the criminal justice information system may "know"
about only those arrested, etc., subsequent to some date. In
this case, "success" may be defined in terms of omission and
various obvious pitfalls can be encountered under these conditions.
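
     The two pitfalls just described can be illustrated with a small
sketch. It is not drawn from this report: the record layout and the
field names ("name", "birth_date") are hypothetical, and real
information systems will differ. The first function shows how the
choice of identifying fields changes which arrest records are
attributed to a program graduate; the second checks whether a
graduate's follow-up period falls entirely within the period the
information system covers.

```python
from collections import defaultdict
from datetime import date


def link_records(graduates, arrests, keys):
    """For each graduate, list the arrest records that agree on `keys`.

    graduates, arrests : lists of dicts containing at least the fields
                         named in keys (hypothetical layout)
    keys               : field names that must all agree for a match
    """
    index = defaultdict(list)
    for record in arrests:
        index[tuple(record[k] for k in keys)].append(record)
    return [index.get(tuple(g[k] for k in keys), []) for g in graduates]


def followup_visible(exit_date, coverage_start):
    """True if the system covers the whole period after program exit.

    When this is False, the absence of an arrest record says nothing:
    an arrest before coverage_start would simply not appear.
    """
    return exit_date >= coverage_start


graduates = [{"name": "John Smith", "birth_date": date(1950, 3, 2)}]
arrests = [{"name": "John Smith", "birth_date": date(1961, 7, 15)}]

by_name = link_records(graduates, arrests, ["name"])
by_name_and_birth = link_records(graduates, arrests, ["name", "birth_date"])
```

     In this made-up example, matching on name alone charges the
graduate with another person's arrest, while adding birth date removes
the false match. If the program's files lack birth dates, the stricter
match is not available and the name-only results need to be treated
with corresponding caution.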



     To summarize:

     •  do not assume information exists;

     •  do not assume existing information is readily
        accessible;

     •  do not assume definitions are consistent;

     •  do not assume information systems are compatible;

     •  analyze the process of inclusion and exclusion
        for possible effects on your purposes.

     The other side of the coin, of course, is that existing
information is almost free and may be unique in the case of past
information. When desired information is not available or is
flawed (in one of the above senses), it will be necessary to
develop new information through interviews, etc., or alter the
intended study design.



B. Programs and Program Environments Are Stable Over Time

     Programs, especially new and innovative ones, constantly
change in major ways as they respond to the internal processes of
development and implementation and to external demands and
pressures from clients or other interested groups. It is important
to determine the amount of program and operating environment
variability during the design stage of an evaluation so that
stability is not mistakenly assumed.

     Two distinct steps need to be undertaken in regard to the
potential for instability:

     •  determination of amount and nature of change;

     •  impact of change on evaluation design.

Just as in the preceding section, interviews with various groups
involved with the program as well as inspection of files and
other documentary sources may be adequate to determine the kind
and quantity of change surrounding the object of the intended
evaluation study. Some of the kinds of change to look for are:

     •  turnover in staff and possible change in
        operating philosophy, goals and style;

     •  changes in priority level assigned to program
        by "society," criminal justice system and
        funding sources;




     •  change in the nature or levels of the problem
        addressed by the program;




     •  creation of other institutional entities which
        somehow impinge on the substantive area of
        concern.

     When changes in any of the above areas are detected, they
should be recorded in terms of a temporal sequence with some
attempt at quantifying the degree of change. Staff complement,
budgets and people, courts, etc., affected, are relatively
straightforward modes of quantification. Changes in operating style may
be more difficult to quantify in retrospect but some subjective
notion that a change was relatively dramatic or not may be
possible.

The purpose of this review of stability (or more likely,
instability) over time is to determine the appropriate character
of a given evaluation task. While analytic and interpretive
procedures appropriate to dynamic programs are discussed in the
following section (Interpretive Pitfalls), our concern here is the
anticipation of problems to be caused by changing programs and
environments. Where programs are found to change (including their
operating environments), the following questions need to be raised
relative to the impact of change on an evaluation:

  • can different "stages" of operation be defined?

  • can different operating styles be typified?

  • what variability in stability and quality can be
    anticipated across the above two dimensions?

  • are available evaluation resources insufficient
    to evaluate competently all the program-environment
    combinations identified above?

To summarize, the variability of programs and environments
over time can increase the range of variability of what is being
evaluated and can enrich an evaluation study. It also can,
however, provide too few examples (whether jurisdictions,
neighborhoods, clients, etc.) for statistical analysis or demand more
resources than are available. In the case of too few examples,
the evaluator may want to recommend a qualitative case study
approach and the establishment of a data acquisition system which
could support an evaluation in the longer run. In the instance
in which the range of operating diversity demands more resources
than have been allocated for evaluative purposes, one approach to
be considered is selection of one or more program-environment
configurations which possess the most significance for policy and
decisional purposes. Putting the above two approaches together,
one might choose to adopt a case study approach for programs in
their early "learning" stage and a quantitative study for programs
in their mature stages.



C. Conventional Criteria or Goals are Appropriate to Project
   and Program Evaluation

When we speak of evaluating programs it is usually in terms
of some set of objectives. While legislation and program
descriptions may yield some set of objectives, the determination of
operating objectives, their priorities and causal relationships
may be less than obvious. For example, a drug abuse treatment
program may be judged in terms of total abstinence from (say)
opiates on the part of the program clients. This has quite often
been the criterion utilized in assessing treatment programs and
was probably informed, at least partially, by the conventional
wisdom that once an addict turned to opiate abuse he would once
again develop a very expensive habit. If, instead, a drug abuse
treatment program sees its purpose to be the minimization of the
cost of a client's habit, less emphasis may be placed on
abstinence than on retaining clients and reducing their levels of drug
use (and hence, presumably, their need to participate in criminal
activity). Within this setting, abstinence becomes one of
several criteria against which the performance of the program can
be measured, with cost and cost reduction entering as additional
criteria. Given the rationale by which drug abuse has been
related to criminal activity, this linkage ought also to be
assessed if possible. In other words, the objective of reduced
drug abuse is instrumental relative to reduced criminal activity.
In cases where the program appears to have succeeded relative to
drug abuse, has criminal activity been reduced as compared to
cases where the program appears to have failed relative to drug
abuse reduction? It is important to note that in many cases, the
objectives of a program may not have been well articulated and
the responsibility of the evaluator may include such objectives
clarification together with the development of their (presumed)
causal linkages. A program intended to train elderly citizens
to protect themselves from criminal assault may have the
worthwhile effect of causing the elderly to feel more secure and
hence more apt to venture out of their apartments. In addition
to the objective of reducing assaults upon the elderly, the
enhanced quality of life enjoyed by those who now feel more
secure is obviously another desirable outcome.

Researchers in various fields have recognized both the
importance and difficulty of causing an organization or program
to clearly specify objectives and goals, their interconnections
and relative priorities. In designing an evaluation, these
issues must be addressed, most likely through the following
procedures:



  • interviews with program administrators can
    elicit operating definitions of "success"
    (e.g., "What would you like to be able to
    include in an annual report?").

  • interviews with those "on the firing line"
    can determine how they assess their own
    performance.

  • interviews with various staff members as well
    as inspection of job descriptions, etc., can
    assist in determining desired personnel
    characteristics.

By developing these and other information sources , 




this sort of decision quite explicit. 







CHAPTER III 
INTERPRETIVE PITFALLS

Given that one has some data in hand and those data have
been analyzed by someone competent, interpretation of the
numbers is no simple, clockwork procedure. In this section,
attention is directed to some of the "obvious" interpretations
which may prove false.

The pitfalls and other topics discussed in this chapter
are all concerned with two basic, policy relevant questions:

  • what are the true sources of program success,
    including necessary conditions?

  • what are likely or plausible constraints on
    transfer or expansion?



A. "Anything Will Work" for a Great Leader

In many instances of program evaluation, conclusions
regarding important influences on program success have
emphasized the significant role played by leadership. It is
important to recognize that innovative approaches in almost any area
may attract certain "innovators" who radiate some particular
charm and dynamism (perhaps charismatic). Hence, pilot programs,
demonstration projects, etc., may be quite successful solely
because of the characteristics of their leaders (i.e., the
structure and mode of operation of the program may be
irrelevant to success). But such leadership characteristics may not
be available in sufficient supply if one desires to implement
such programs on a large scale. Furthermore, should such
massive implementation be undertaken, the dynamic innovators may no
longer be interested and those who stay may "burn out," losing
the effectiveness which initially caused the pilot programs to
be successful. The evaluator and the consumer of the
evaluator's work must consider both the potential "leader effect" and
the question of replicating that leader to expand the
population served by a given type of program.



In order to investigate this potential problem it is
important to gather information regarding leader or director
characteristics including style of operation, educational attainment,
work history (including level of job turnover), personal interests,
etc. Two simple questions can be asked of the data:

  • do all the project directors of a certain type
    of project have some things in common? (The
    commonality could be something abstract such as
    eclectic interests or atypical occupational/
    educational histories.)

  • how much of the variability in project success
    can be associated with project directors'
    characteristics?
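
One hedged way to approach the second question is to compute the share of variance in a success measure that lies between groups of projects defined by a director characteristic. The sketch below is illustrative only: the success scores, the grouping into "typical" and "atypical" career histories, and the function name are all assumptions, not data or procedures from this report.

```python
from statistics import mean

def share_of_variance_between_groups(groups):
    """Return between-group sum of squares divided by total sum of squares.

    `groups` maps a (hypothetical) director characteristic to the list of
    success scores of projects led by directors with that characteristic.
    """
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    between_ss = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return between_ss / total_ss

# Hypothetical success scores (0-100) for eight projects.
scores_by_career_history = {
    "atypical": [80, 75, 85, 70],
    "typical": [60, 55, 65, 50],
}
share = share_of_variance_between_groups(scores_by_career_history)
```

A large share would suggest, but not prove, a leader effect; with only a handful of projects the figure is unstable and should be read alongside the qualitative information gathered on each director.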



Prescriptive Checklist

1. Collect information descriptive of each program leader
in terms of background, executive style, operating philosophy
(if possible).

2. Compare program leaders to determine any commonalities
(e.g., do they all have unusual career histories?).

3. Is there anything about the leader population which
makes them "odd birds" unlikely to be available in sufficient
supply to support program expansion tenfold?

4. Is there any indication that those who have held the
leader's position longest are experiencing reduced effectiveness
or are considering leaving?

5. Is there any relationship between leader
characteristics and program effectiveness?

B. Cross "Cultural" Transfer Can Be Problematic



We use the term "culture" in a very broad, inclusive manner
to describe those traits, practices and attitudes which vary for
ethnic and other social groupings (including social classes).
What is proposed is the principle that the effectiveness of
various programs depends on certain conditions which may be
termed "cultural." In assessing a project or program it is
desirable to note for whom and in what areas the operations
appear to have maximum effect. Moreover, some understanding of
the cultural elements on which a program depends is desired,
particularly in the case of a project or program which has operated
under conditions of cultural homogeneity. In short, what works
for a middle class, white urban population (say, a diversion
program) may not be effective without modification among rural
Native Americans. The evaluator should expend reasonable efforts
to collect data concerning sub-cultural attributes of a project
or program's target population and to make note of plausible
connections between such traits and a program's mode of operation
and relative success.



Prescriptive Checklist

1. Collect information descriptive of cultural background
of "participants" (i.e., staff, clients, target population, etc.)
including social class, urban-suburban-rural, racial/ethnic
characteristics.

2. Are the cultural characteristics relatively constant
or diverse?

3. To the degree there is cultural diversity, are any
cultural elements associated with program performance?

4. To the degree there is little cultural diversity, can
you identify possible program characteristics (or elements) which
are "culture bound" (i.e., would likely require modification in
another cultural context)?

C. Operating Environments Change



Just as a program's effective operation may be contingent
upon some cultural traits among the target population, so, too,
a program may be successful within a certain operating
environment but not in others. If the availability of street heroin is
curtailed through some other mechanism, a drug abuse treatment
program may have an enviable record of recruiting and holding
clients. Should opiates again become readily available on the
street, the enviable record may become history. While this
example drew on an environmental factor (availability of illicit
drugs) which is affected by efforts of various components of
the criminal justice system, other environmental factors may not
be so controllable.

The national economy and weather are two factors which
impact on various programs. For example, community-based
corrections programs often experience low dropout rates during
winter months and higher rates during the summer. Similarly,
during periods of economic recession property crimes often
increase. Such factors are more than statistical "problems" to
be dealt with analytically, for they also represent real-world
factors which impinge upon operational programs. In the case of
seasonal effects, for example, programming of projects might well
take these factors into consideration and, furthermore, different
forms of community based corrections programs might be deemed
appropriate for sun belt states in which the inducement of harsh
weather is absent.
* 

Prescriptive Checklist

1. Collect information concerning the operating
environment of the program, specifically those environmental factors
which are both subject to change and are thought potentially
related to program functioning.

2. Relate environmental information to program performance
information, both across programs and for single programs over
time.

3. Where data are insufficient for the above sorts of
analysis, it is especially important that plausible conjecture
be undertaken in this regard.



D. If a Little Bit Works...

Quite frequently, a given program type is tested and
evaluated in terms of a prototype or demonstration project (quite
appropriately, by the way, for too often broad gauge social
programs have been implemented on a very large scale with little
or no evaluation of their effectiveness or consideration of their
unintended consequences; witness the number of high rise slums in
our nation's cities). Assuming the prototype project is
evaluated as relatively successful, planners and policy
formulators may feel justified in expanding the program. Some thought
should be directed, however, to the various ways in which the
prototype's small size and unique status may partially explain
its success, such that this level of success cannot reasonably
be projected to a greatly expanded program.

The "Hawthorne effect" is well known in social research. In
its most general sense, the term refers to the effect of exposure
to a relatively unique situation (including the presence of
researchers asking questions) which can have significant impact
on results (in the original Hawthorne study, productivity of
workers in a Western Electric assembly facility was the object of
interest). That one is participating in "something special" can
have remarkable effects on the staff and others involved with a
program. This special status will no longer be an attribute of
the program when it is greatly expanded and hence the expanded
program cannot be projected as a simple expansion with simple
multiple benefits.

A prototype program can be seen as a small factor within a
larger system. If intensive crime prevention techniques are
imposed upon a relatively small geographic area, crime may be
reduced within the target area. However, the crime reduction may
in part be a reflection of displacement of criminal behavior to
areas outside the target zone. Again, results from the small
program cannot be projected simply to a proposed larger program.
Similarly, the existence of one relatively open correctional
facility within a larger system of other corrections facilities
presents problems of analysis and interpretation. The success of
the open facility may, in part, be dependent on the tacit threat
represented by the continued existence of stricter institutions
to which offenders can be transferred for infractions of the rules.
In short, the strict institution may be necessary to the success
of the open institution. Should an entire correctional system
be transformed into totally open institutions, one would have little
basis on which to predict system success from the experience of
the single facility.
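
The displacement concern can be made concrete with simple arithmetic: compare the change inside the target area with the change in adjacent areas over the same period. The sketch below is a minimal illustration under stated assumptions; the incident counts and the function name are hypothetical, and a real assessment would also need to allow for trends unrelated to the program.

```python
def net_crime_change(target_before, target_after, adjacent_before, adjacent_after):
    """Return (target_change, adjacent_change, net_change) in incident counts."""
    target_change = target_after - target_before
    adjacent_change = adjacent_after - adjacent_before
    return target_change, adjacent_change, target_change + adjacent_change

# Hypothetical counts of reported incidents, before and after the program.
target, adjacent, net = net_crime_change(
    target_before=200, target_after=140,
    adjacent_before=300, adjacent_after=345,
)

# Share of the target-area reduction offset by increases in adjacent areas.
displaced_share = adjacent / -target if target < 0 else 0.0
```

In this made-up case most of the apparent reduction is offset next door, so projecting the target-area result to a jurisdiction-wide program would overstate the likely benefit.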



Prescriptive Checklist

1. Is the program a prototype or otherwise relatively
unique?

2. Is there a sense of participating in "something
special" among relevant actors?

3. Assess potential for "displacement," etc.

4. What is the relation of the prototype program to
"main stream" programs?

5. What are other problems associated with broad scale
implementation?



E. Individuals Are Not Groups: The Ecological Fallacy

Social research analysts, criminal justice system analysts
included, often operate with several units of analysis. On
occasion, the units may be individual persons; at other times,
census units or other geographic areas; and at others, programs.
All of which is well and appropriate except when the differences
between these units and the ways in which they are, or are not,
related to each other are ignored. If one determines the
relation between the median personal income of neighborhood areas
and the proportion of children within those areas in need of
youth services, one has not determined the relationship of those
two variables for families or cities or anything other than
neighborhoods. Whereas median personal income may tell us a great
deal about a neighborhood in terms of residential mobility,
youth culture, availability of various amenities, etc., those are
not attributes of a family with a given income (residence in a
neighborhood with a given median income is an attribute of a
given family and that contextual attribute may be of significance
in understanding the behavior of the members of the family,
independent of that family's income). The problem discussed, that of
attributing group level findings to individuals, is known as the
"ecological fallacy." That is, what is true of the neighborhood
is not necessarily true of every individual or family in the
neighborhood. This fallacy has a complementary cousin which is
sometimes termed the "fallacy of composition." This second fallacy
entails projections from individual level findings to higher order
units (or, more generally, the projection upward from smaller units
to larger units). An example from outside the criminal justice area,
which is hypothetical, but plausible, is the following:

  • persons enjoying higher incomes are exposed to
    lower levels of air pollution than those with
    lower incomes; but

  • areas with higher median incomes have higher
    levels of air pollution.

The first, individual level finding, relates to the ability to
avoid pollution which higher incomes enable individuals to
undertake (i.e., residing in cleaner suburbs, etc.). The second, area
level finding, relates to the association of polluting industry
with income generation. Thus, if a policy maker looked at the
individual level finding with a desire to reduce the level of
pollution, the resulting policy could be absurd. Similarly, a
program which focuses on individuals does not necessarily have
the collective impact which might be inferred from individual
level data.
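
The air pollution example can be reproduced with a small set of made-up numbers. The sketch below assumes three hypothetical areas with three residents each; the figures and the helper name `pearson_r` are illustrative assumptions, chosen only so that the person-level and area-level correlations take opposite signs.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical (income, pollution exposure) pairs for persons in three areas.
areas = {
    "A": [(5, 9), (10, 5), (15, 1)],
    "B": [(7, 10), (12, 6), (17, 2)],
    "C": [(9, 11), (14, 7), (19, 3)],
}

# Individual level: pool every person regardless of area.
persons = [p for residents in areas.values() for p in residents]
individual_r = pearson_r([i for i, _ in persons], [e for _, e in persons])

# Area level: one (mean income, mean exposure) point per area.
area_income = [mean([i for i, _ in r]) for r in areas.values()]
area_exposure = [mean([e for _, e in r]) for r in areas.values()]
area_r = pearson_r(area_income, area_exposure)
```

Here `individual_r` is negative while `area_r` is positive, so a conclusion drawn at one unit of analysis would point the wrong way if carried over to the other.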

All of which is to say that conclusions based on data on one
unit of analysis can be transferred to another unit of analysis
only with great caution. The "great caution" term should be
understood, however, insofar as extrapolation is possible when
accompanied by some model (or at least understanding) of the different
mechanisms operating at different levels of analysis and reality.

Prescriptive Checklist

1. Are all variables appropriate to the same unit of
analysis (person, neighborhood, etc.)?

2. Are all relationships stated at the same level of
analysis?

3. Note appropriateness of contextual analysis in which
collective attributes are assigned to individuals (e.g., type of
neighborhood can be used to describe an individual's experiences,
resources, etc.).

4. Specify mechanisms which serve to explain the
relationships among variables at different levels of analysis.



F. Constrained Populations: Selective Recruitment and
   Differential Attrition



Where special situations obtain, extension of findings (and
even the validity of conclusions) may be questionable. "Regression
effect" is a technical, statistical term which refers to an
oft observed fact: the most extreme are about to become less
extreme. This, by the way, is not a universal truth (the
opposite phenomenon, auto-correlation or positive feedback, covers
the apparent truth that success begets success and failure begets
failure). Regression effect, put most simply, assumes that a
portion of any observed measure is transitory (e.g., the heaviest
person in a class is heaviest on the day of weighing, in part,
because that person has been on a recent eating binge and/or
skipping normal physical activity; and that this person is about
to return to normal activities). In the field of criminal
justice evaluations, a classic example has been offered by
Campbell, et al.

The important point is that analysis based on extreme cases
needs to be informed by possible reasons for an observed change.
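
A short simulation can show how a regression effect may look like program impact. The sketch below assumes each case's observed score is a stable level plus an independent transitory component; the distributions, sample sizes, and variable names are illustrative assumptions, and no program is applied between the two observations.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Each case has a stable level plus a transitory component at each observation.
stable = [random.gauss(50, 10) for _ in range(1000)]
first = [s + random.gauss(0, 10) for s in stable]
second = [s + random.gauss(0, 10) for s in stable]  # no program applied

# Select the 100 most extreme cases on the first observation.
selected = sorted(range(1000), key=lambda i: first[i], reverse=True)[:100]
mean_first = mean(first[i] for i in selected)
mean_second = mean(second[i] for i in selected)

# Drop in the selected group's average that occurs with no intervention.
apparent_improvement = mean_first - mean_second
```

Because the selected cases were chosen partly for their transitory highs, their average falls on the second observation even though nothing was done; an evaluation of a program targeted at extreme cases needs a comparison that separates this drop from any real effect.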

The selection of cases for inclusion in a program, whether
individual or something as large scale as an overall program for
high crime areas, can impact upon the ability of a planner to
assess those findings. A program implemented in extreme cases
(however defined) is operating in a rarefied environment. When
the program is extended to less extreme situations, things may be
very different. Where a problem is extreme (whether in the
individual case or the community) practices may be accepted and be
effective, whereas in a less extreme case, the same approach
might be neither accepted nor effective.

The problem of differential mortality or attrition of program
participants is another means by which observed results can be
misleading. For example, in many instances, some form of
"success" removes a case from further participation in a program.
This results in a potentially significant difference between the
composition and characteristics of program participants at a
given point in time and the composition and characteristics of
those entering and those exiting the program. The removal of
successes can have substantial operational significance which
goes beyond the problems of an evaluator as narrowly defined.
Further, if successful program directors tend to move upward or
otherwise leave the positions in which they proved themselves,
there can be obvious implications for program functioning. The
evaluator who spots such a tendency should be prepared to
document it, spell out the operational ramifications and suggest
management procedures to deal with it.
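
The gap between who enters a program and who is present at any one time can be sketched with simple steady-state arithmetic. The example below assumes a constant monthly intake and fixed lengths of stay for two hypothetical client types; the labels, counts, and function name are illustrative assumptions only.

```python
def steady_state_census(entrants_per_month, months_in_program):
    """Return the number of each client type present at any one time.

    Under a constant intake, the census for a type is its monthly intake
    multiplied by the number of months that type stays in the program.
    """
    return {
        kind: entrants_per_month[kind] * months_in_program[kind]
        for kind in entrants_per_month
    }

# Hypothetical intake: equal numbers of two client types each month.
intake = {"early_success": 50, "long_stay": 50}
# Hypothetical lengths of stay: successes exit quickly, others remain.
stay = {"early_success": 1, "long_stay": 6}

census = steady_state_census(intake, stay)
share_of_intake = intake["early_success"] / sum(intake.values())
share_of_census = census["early_success"] / sum(census.values())
```

In this made-up case early successes are half of all entrants but only about one in seven of those present on a given day, so a point-in-time snapshot would understate how often the program succeeds.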




Prescriptive Checklist

1. Selection processes need to be described relative to
program participation: are we seeing the best or the worst of
some situation?

2. Determination of alternative means by which "cases" can
disappear from a program.

3. To what degree can the above conditions change
interpretation of results?



G. Absence of Total Rigor Totally Invalidates the Results of a 
Study 

While the thrust of this manual is its orientation with
respect to problems and pitfalls, it is not entirely gloomy. More
than one federal policy maker has been heard to express a desire
for a "one handed evaluator" because of their exasperation over
evaluation studies which conclude with the form, "On the one
hand... but on the other hand..."

Even those studies which received very large levels of
support in order to achieve definitiveness have not always been
successful. At the conclusion of data analysis too many
evaluative studies are flawed by an undue modesty due to perceived
methodological inadequacies. The policy maker is interested in
something which has relevance to decisions. Seldom is a study so
flawed that nothing can be said, although this possibility should
have been considered while completing the "Requirements
Checklist." Indeed, where the evaluator feels that almost nothing can
be salvaged because of some "fatal flaw," it would be advisable
to return to the Requirements Checklist to repeat the exercise.
Elsewhere ("Capitalizing on Adversity") problems encountered in
the conduct of an evaluation are discussed as unanticipated
consequences. Here, however, our concern is with addressing the
issues about which the evaluation originally anticipated
developing information.

The evaluator may be able to document that a program has an
impact in the intended direction without being able to document
the precise ways in which the impact is effected. Competing
interpretations (e.g., Hawthorne effect, seasonal effects) have
been discussed as cautionary notes. No matter the reason, a
program may be said to work, while recognizing that a program is
a complex and dynamic entity. Understanding the limitations of
extending program findings has been discussed at length. Here,
instead, we emphasize the need (with all appropriate provisos) to
report the actual, empirical findings.



An associated issue arises concerning the range of
observations available to the evaluator. The more measures we have
available for independent analysis, the more certainty we can have
concerning conclusions. For example, the Westinghouse Justice
Institute prepared a Summary of Parole Enhancement Programs'
Technical Assistance Needs and Problems. Fourteen dimensions
relative to program management and operation were assessed for
each program. Although the purpose of the Westinghouse survey
was the determination of Technical Assistance needs, it could be
interpreted as similar to a part of an evaluation. While the
evaluator must beware of drowning in a mass of data, the
availability of different sub-elements or components of an overall
concept, all (or at least most) of which point in the same
direction, enhances the credibility of analytic findings. This
approach is somewhat akin to that involved in repeated imposition
of study designs across different populations, except that in the
current case variables or measures, rather than populations, are
varied.

Finally, the evaluator should recognize the needs of various
consumers of the evaluation. Whether or not the given program
can be said to "work," various independent findings may
be of interest to policy makers. Hopefully, the evaluator can be
sensitive to the needs and interests of the various consumers of
the evaluation such that various unanticipated findings can be
appropriately communicated.






CHAPTER IV 



CAPITALIZING ON ADVERSITY 



The evaluation researcher, monitor and decision making
evaluation consumer all bring different perspectives to the
conduct of an evaluation. Here we are interested in exploring
the uses of factors in the evaluation situation which the
evaluation researcher may view as troublesome. Our primary
contention is that, too often, evaluators adopt a certain sort of
tunnel vision in which purposes are very narrowly defined (as
if a gold prospector were to become infuriated because his pick
were dulled by hitting a two pound diamond). What we offer
here are examples of a more general phenomenon. It is hoped
that, through discussion of these instances, an appreciation of
the more general principle will be nurtured. One formulation of
the principle would be:

    if you encounter a "problem" which was unanticipated,
    there are probably many others involved
    within the criminal justice system who don't
    know about it, and you are their eyes and ears.

A. Deviation of Programs from Legislative or Planning
   Specifications are Valuable for Policy and Planning Audiences



The deviation of programs from legislative or planning
specifications is troublesome to the evaluator in that the
evaluation's primary mandate was to determine if Program A works.
Thus, when the evaluator discovers that various programs called
"type A" vary significantly from enabling legislation, etc., the
evaluation of A-type programs is in difficulty. Obviously, we
cannot answer the question, "Does A Work?" when we can't find an
A. On the other hand, two new questions arise:

• do the variants of A show significant differences 
in terms of effectiveness? 

• why do the operational programs deviate from the 
legislated programs? 



In the case of the first question, one is exploiting the
"natural" variability among programs. The variability among
programs may well dilute the statistical power of certain intended
analyses but the evaluator is to study the effectiveness of
phenomena. As an example, assume the evaluator is to study the
effectiveness of therapeutic communities in community-based corrections
programs for youth. Whereas the original evaluation design
anticipated homogeneity of "therapeutic communities," the reality
encountered is one of great diversity from "therapeutic-less"
residential facilities to programs with intense, confrontational
encounters, little "free time," etc. Whereas the number of cases
exposed to "identical" treatment has been reduced, the range of
treatment types to be analyzed has been expanded. Thus, Program
Ai can be contrasted to Program Aj. While the original design
has been compromised, the intent has been enriched. Moreover, if
no differences among the variants can be identified with respect
to effectiveness (nature of clients taken into account), then one
may have discovered that the so-called treatment activities are
irrelevant and that something else, such as "residentness," is the
crucial treatment. All of which is speculative here, but our
point is that the evaluator must be flexible and ready to listen
to the data when the unanticipated occurs.

An institutional question is suggested by the deviation of
programs from specifications, as mentioned above. Is it because
program staff believe they have a better way? Or is it that some
resources presumed by the specifications are not available? In
the latter case, it may be that certain skills are not available
in the work force which can be recruited at specified wage levels,
etc. Investigation of the "Why?" question with respect to program
deviation can be of very real assistance to program managers and
others.

A final question which can be raised in this event has to do
with whether the deviations can be considered disruptive to the
original policy objectives. This may well entail a relatively
subjective judgment (although supported by factual observations)
but could prove as valuable as more "objective" findings.



B. That Programs Are Dynamic May Indicate Modes of Institutional
Learning Which Can Be Transferred 

When programs change their mode of operation over time, the
evaluator's undertaking is complicated in much the same way it
was in the preceding example. Once again, however, we are given
the opportunity both to study a broader range of program
variability than anticipated and to learn something about the dynamics
(or life histories) of programs of certain types. Since the
first question is of the same variety as that discussed within
the preceding section, we turn directly to the question of the
evolutionary dynamics of programs. In the case of community
anti-crime programs, for example, it is perfectly reasonable to
expect evolution (and, perhaps, mediation as well) and the
"learning curves" of such programs as things which ought to be
understood, not only in terms of relative effectiveness, but as
points at which specific forms of assistance might prove
particularly beneficial. While conducting the traditional evaluation,
it is advisable to maintain chronological records concerning
organizational dynamics of the sort appropriate to a case study.
Again, while the evolution of the program is an unanticipated
event, it provides both a finding as well as an opportunity for
additional research topics of direct policy relevance.
Furthermore, the modes of evolution of different programs can be
contrasted and some assessment of relative costs and benefits among
the alternative evolutionary modes can be made. The evolutionary
form adopted by a given program is not necessarily the best one
and this assessment can prove invaluable in assisting new programs
in the future.



C. Attempts to Conceptualize and Operationalize "Appropriate"
Objectives and Goals Can Impact Planning and Legislative
Language and Procedures

Goals and objectives of programs are often enunciated in
extremely broad, general terms. An evaluator, on the other hand,
requires that measurable objectives be specified. One of the
evaluator's frequent tasks, therefore, is to work with program
staff and others in developing observable and measurable
translations of their broad-gauge goal and objective statements.
This effort can prove productive for purposes which go well beyond
the conduct of the evaluation. For example, a program designed to
reduce criminal exploitation of the elderly might mention:

     • enhanced safety of the elderly in their neighborhoods

     • enhanced safety of the elderly within residences

     • enhanced sense of security of the elderly, relative to
       criminal attack

Alternative approaches to these three objectives are available,
both in terms of program tactics and evaluation measurements. As
the evaluation staff works with the operations staff in
translating these objectives into a set of measurable indicators
and articulating the assumed or hypothesized relationships among
the objectives, new insights can be expected on the part of the
operations staff. For example, it will often be the case that
objectives become elaborated into sub-objectives with a
logical-temporal sequence. In this way, the evaluator's demand for
some clarity about evaluation criteria can become a useful stimulus
to program staff to clarify their purposes and the instrumental
means by which their objectives are to be gained.




D. Failure to Find a "Pseudo Control Group" Is Itself a Finding,
Perhaps Relative to Recruitment Efficiency

Program evaluations are often undertaken with the presumption
that a "pseudo control group" can be identified. Whether the
analytic units are persons, neighborhoods or other entities, the
design is founded on the assumption that we can find a group of
units similar to the "treatment group" with the exception that
they have not been treated. In the author's experience, this
assumption is often not met (as discussed earlier). While this
is troublesome to the conduct of the evaluation (as designed), it
constitutes a significant finding with respect to program
functioning and (with respect to programs impacting persons)
recruitment or organizations (with respect to community programs).
This is not to say that the program's effectiveness has been
evaluated, but something of worth has been determined.

To summarize this section, when the unexpected throws a
monkey wrench into an evaluation design, that which is unexpected
may constitute a finding and may also offer the basis for a
revised design. Again, keep in mind the numerous audiences to
be served by the evaluator within the criminal justice system.
Disappointing news to the evaluation manager may be important
input for some policy maker.






NOTES 

1. "Quick Evaluation Methodology," 1973, Special Action Office
for Drug Abuse Prevention, Executive Office of the President.

2. Daniel Glaser, Routinizing Evaluation: Getting Feedback on
Effectiveness of Crime and Delinquency Programs, 1973,
National Institute of Mental Health Center for Studies of
Crime and Delinquency: Rockville, Maryland.

3. Public Policy and Evaluation Research: A Perspective on an
Art, i^li, Office of the President's Science Adviser,
Science and Technology Policy Office, National Science
Foundation: Washington, D.C.

4. Charles J. Hitch, "On the Choice of Objectives in Systems
Studies," Santa Monica, California: The RAND Corporation,
1960.

5. Aaron Wildavsky, "The Self-Evaluating Organization," Public
Administration Review, September/October, 1972.

6. D. T. Campbell and H. L. Ross, "The Connecticut Crackdown on
Speeding: Time-Series Data in Quasi-Experimental Analysis,"
Law and Society Review, III:1, pp. 33-53.

7. Westinghouse Justice Institute, "Summary of Parole Enhancement
Programs' Technical Assistance Needs and Problems,"
Contract Number J-LEAA-003-76.

8. Discussion of various designs is available in Intensive
Evaluation for Criminal Justice Planning Agencies, Washington,
D.C., National Institute of Law Enforcement and Criminal
Justice, Law Enforcement Assistance Administration, U.S.
Department of Justice, 1975, pp. 5-7.

9. Harvey Averch, "Public Sector Productivity," First Annual
RANN Symposium, Washington, D.C., 1974, pp.

10. Discussion of some of these approaches is available in Stuart
Adams, Evaluative Research in Corrections: A Practical Guide,
National Institute of Law Enforcement and Criminal Justice,
Law Enforcement Assistance Administration, U.S. Department
of Justice, Washington, D.C., 1975, Chapter 9.






SELECTED BIBLIOGRAPHY 



THE CONTEXT OF EVALUATION

1. Carol H. Weiss, "Where Politics and Evaluation Research Meet,"
Evaluation 1:3, 1973, pp. 37-45.

     Excellent treatment of the differing perspectives,
     allegiances and needs brought to bear on the evaluation
     process by various actors in the evaluation arena.

2. Harold L. Wilensky, Organizational Intelligence, New York:
Basic Books, 1967.

     A good discussion of the general functions and relations
     of knowledge and policy. Evaluation is not treated as
     a separate form of "intelligence" but the reader
     interested in evaluation can place it in the context
     supplied.



SOME TECHNICAL ISSUES 



3. Frank Andrews, et al., "A Guide for Selecting Statistical
Techniques for Analyzing Social Science Data," Ann
Arbor: Institute for Social Research, University of
Michigan, 1974.

     A helpful aid for the non-statistically oriented
     individual to understand the process by which
     appropriate analytic techniques can be selected for a
     given body of data and interpretive purpose.

4. Ilene N. Bernstein, ed., Validity Issues in Evaluative
Research, Beverly Hills: Sage Publications, 1976.

     A collection of essays treating the state of the art with
     respect to selected issues in evaluation. The chapter
     by Alwin and Sullivan, "Issues of Design and Analysis
     in Evaluation Research," is a lucid treatment of issues
     surrounding nonexperimental and quasi-experimental
     designs.



LOGISTICS

5. A Guide for Local Evaluation, Washington, D.C.: Department
of Housing and Urban Development, 1976. (Available from
Superintendent of Documents, U.S. Government Printing
Office, Washington, D.C. 20402, stock number
0^3*0^^0327-9.)

     A guide to conducting evaluations organized as a series
     of readings covering administrative and logistical
     issues as well as "methodological" concerns.

6. Eve Weinberg, Community Surveys with Local Talent, Chicago:
National Opinion Research Center, 1971.

     This manual contains a great deal of material with
     respect to the details of running a field operation:
     interviewer identification cards and carrying cases,
     size of interviewer training groups, quality control
     and payment procedures. Especially helpful are sample
     forms with respect to the several stages of a field
     survey, from interviewer recruiting and sampling to
     record keeping and quality control.






CRIMINAL JUSTICE EVALUATION

7. Daniel Glaser, Routinizing Evaluation: Getting Feedback on
Effectiveness of Crime and Delinquency Programs,
Rockville, Maryland: National Institute of Mental
Health/Center for Studies of Crime and Delinquency,
1973. (Available from Superintendent of Documents, U.S.
Government Printing Office, Washington, D.C. 20402,
stock number 1724-00319.)

     A well prepared and thoughtful discussion of the "whys"
     of evaluation, including policy relevant considerations
     of what statistics and what comparisons are appropriate
     given a specific policy question.






APPENDIX A 

DISCUSSION: VARIABLES



A variable, for our purposes, is something we observe and
for which we can characterize differences or variations. The
simplest sort of variable is a "dichotomous attribute" composed
of only two categories. For example, the governing entity has
or has not instituted a given program; a prison releasee either
does or does not recidivate within six months of release, etc.
Different writers use somewhat different vocabularies to discuss
classes of variables and the following treatment will attempt to
serve as a mode of translation across the several traditions
which give rise to the different vocabularies.

Explanatory Variables

Explanatory variables are those which are used as the basis
for developing an explanation of the variability of other
variables. Several sorts of explanatory variables may be
encountered. An independent variable (which may also be termed a
predictor variable, or a design variable) is the basic type of
explanatory variable. In the following statements, "A" fills
the space which would be occupied by an independent or predictor
variable:

     • recidivism is positively predicted by A;

     • the higher the median A of a police force, the
       lower the response time;

     • the A of a community is not predictive of the
       level of assaultive crimes reported.

Note that in the latter case, "A" fills a slot for an independent
variable even though it is said to be ineffective as a predictor.
This point is important: to say of a variable that it is
"independent" is to indicate its location in the logical sequence
of analysis without regard to its actual effect. Moreover, a given
variable may be independent for one step in an analysis and
something else in another (more of this in a moment). Typically,
the uses of the terms "independent" and "predictor" in this regard
are identical, with the following proviso:




     • [In the case of experiments, design variables] are
       descriptive of treatment conditions or levels as
       distinct from "variables of measure."

     • In the case of non-experiments, such as sample
       surveys, the term may be used to describe the
       sampling procedures as applied over different
       populations (for example, urban vs. rural).

In any event, it is important to keep in mind that each of
these terms is synonymous at a fairly abstract level; that is,
they are explanatory and hence they precede certain other kinds
of variables in the cause-effect logic of the research. It
should also be recognized that multiple independent variables
can be, and often are, introduced simultaneously. Such analyses
are normally termed "multivariate."

Intervening Variables

Intervening variables constitute another class of explanatory
variables and have direct relevance for program evaluation.
A program is said to have objectives which promote the attainment
of goals. Thus, if "A" represents a program level of effort (or
performance, etc.), "B" represents achievement of some objective
and "C" represents attainment of goals, we may ask:

     • Does A promote B?

     • Does B promote C?

     • Does A promote C?

Consider the following statement concerning the Indian Health
Service and its relationship to juvenile delinquency:

     Insofar as the program treats Indian youth
     for mental and emotional disabilities and
     for drug abuse, it addresses factors believed
     to cause delinquent behavior.

In this case, some measure of treatment is the independent
variable, the client's mental and emotional status is the
intervening variable, and the delinquent behavior is the dependent
variable; that is, it is the result which is to be explained by
the explanatory variables.

Of particular interest to the evaluator, in addition to
asking the obvious question regarding the relation of the
intervening variable to the dependent variable, is the relation of
the independent variable to the dependent variable apart from that
due to the intervening variable. In the example above, there may
be some influence on the level of delinquent behavior which does
not operate through the intervening variable identified.



For example, the exposure of youth to a certain type of
adult role model may result in altered career aspirations and
longer temporal orientation. This in turn may reduce the
desire for immediate gratification through delinquent
activities. If this were to prove true, the ramifications may be
more profound for program planning than any results which
evaluate the program without going inside the black box to
understand the processes which are operating. Put simply, if the
adult role model were to have appreciable impact, significant
economies might be effected by delivering this "treatment."
That is, exposure to a certain type of adult role model may be
far more cost-effective than "therapy" for the bulk of the
population at risk.

Interaction Effects 

Interaction effects are frequently encountered in the conduct
of evaluation research. They occur when the joint effect
of two or more explanatory variables is other than the simple
sum of their individual effects upon the dependent variable.
Numerous terms exist in the literature to describe variables
which behave in this manner. Among the more common terms are:

     • intensifier, or catalytic variable

     • suppressor variable

     • multiplicative (as opposed to additive) effect

No matter the name used, this situation poses interesting
problems, particularly in the instance in which the investigator is
unaware of one of the interacting variables. In this case, if
the unrecognized variable has an "appropriate value," another
explanatory variable may appear to have no effect; or, if the
unrecognized variable achieves a different value, the explanatory
variable can appear to have a very strong impact on the dependent
variable. In the case in which studies (or components thereof)
seem to differ in terms of the effectiveness of a program, it
may be that the interactive phenomenon is operating. Thus, one
needs to search for what distinguishes the successful programs
from the other programs and thereby hope to detect the other
variable in an interaction effect.
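
The masking problem just described can be sketched with a small calculation. The following is only an illustration under stated assumptions: the success rates are invented, and the second explanatory variable (labelled "other" here) is hypothetical rather than drawn from any study cited in this report.

```python
# Hypothetical success rates (percent), invented for illustration only.
# Keys: (program delivered?, unrecognized "other" variable present?)
success_rate = {
    (False, False): 40.0,
    (True, False): 40.0,   # program alone: no apparent effect
    (False, True): 45.0,
    (True, True): 70.0,    # program plus the other variable: large effect
}


def program_effect(other_present):
    """Difference in success rate associated with the program at a fixed
    value of the other explanatory variable."""
    return (success_rate[(True, other_present)]
            - success_rate[(False, other_present)])


def interaction():
    """Departure of the joint effect from the simple sum of the two
    individual effects; zero would indicate a purely additive pattern."""
    return program_effect(True) - program_effect(False)


effect_without = program_effect(False)
effect_with = program_effect(True)
interaction_term = interaction()
print(effect_without, effect_with, interaction_term)
```

An investigator who observed only sites where the other variable was absent would see no program effect at all, which is why comparing successful and unsuccessful sites can help surface the second variable.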

Dependent Variables 

These represent the phenomena to be explained by the
explanatory variables already discussed. Dependent variables
are also sometimes referred to as "criterion" variables. In any
event, the research question is clear:

     Does the nature of what we observe when we
     measure the dependent variable depend on
     the nature of what we observe when we measure
     the independent and intervening variables?

Once again, a variable is a dependent variable because of
a decision as to its place within an analysis, whether or not
it is in fact dependent upon the explanatory variables. Note
that it is variability in the dependent variable which is to be
understood in terms of variability in independent variables.
Because of this, tabular presentations relating explanatory to
dependent variables should state something like the average
(arithmetic average, median, etc.) or the percentage of
"successes," etc. Thus, a table relating recidivism and
occupational level would probably make more sense if recidivism
rates were stated for each of several occupational strata (that is,
recidivism is the dependent variable and occupational level is
the independent variable), rather than one which stated the
percentage of semi-skilled persons (say) for each of several levels
of recidivism.



            PERCENTAGE RECIDIVISM BY OCCUPATIONAL STATUS

Unskilled     Skilled      Clerical     Managerial and     Total
Labor         Labor                     Professional



             PERCENTAGE UNSKILLED BY TIME TO RECIDIVISM

Less than     Less than    Less than       Less than       Total
two months    six months   twelve months   two years



To summarize, a variable is constituted of some number of
categories or values such that, for any given observation, one
and only one of the categories is appropriate. Which is to say
that, ideally, the categories are exhaustive and mutually
exclusive over some class of observations.






APPENDIX B 



DISCUSSION: CORRELATION



A great deal of evaluation work involves relating two or
more variables (see Appendix A). In the strict, technical
sense of the term, "correlation" refers to a limited set of
statistical measures, or coefficients. More generally, however,
we say that two variables are correlated if they show some
association, and this will serve as the basis of this discussion.

As an example, consider the following data derived from
Table 3, Pre-Adjudicatory Detention in Three Juvenile Courts
(U.S. Department of Justice, LEAA, NCJIS, Utilization of Criminal
Justice Statistics, Analytic Report 8, 1975):



              Detention Decision Outcomes by Sex
                   Memphis-Shelby County

Detention Decision
Outcome               Female         Male          Total

Not detained           46.3%         57.9%
                       (978)        (3,238)       (4,216)

Detained               53.7%         42.1%
                      (1,135)       (2,354)       (3,489)

TOTAL                 (2,113)       (5,592)       (7,705)



A great deal of information exists in such a tabular presentation,
and this is only a portion of the published table, which included
data for three areas in addition to Memphis-Shelby County. An
orderly inspection of the table will draw out the following
pieces of information:

     • a total of 7,705 cases are represented;






     • more than twice as many males as females are
       represented (5,592 males and 2,113 females);

     • somewhat more of the cases were not detained
       than were detained (4,216 were not detained
       and 3,489 detained).

At this stage, we have exhausted the univariate (single variable)
information available and are prepared to inspect the internal
distribution, which is termed bivariate as it considers the joint
distribution of sex and detention decision outcome. Note that
the percentages sum to one hundred going down the columns. That
is, percentages have been computed on the basis of sex so that
we can speak of the percentage of women who are detained (53.7%)
as compared to the percentage of men who are detained (42.1%).
Because we chose to have percentages sum to one hundred for each
sex, we would say that sex is the independent variable and
detention decision outcome is the dependent variable. This is the
appropriate decision if we wish to use sex to enhance our
understanding of detention decision outcome. On the other hand, had
we been interested in assessing needs for female and male
detention capacities, we would be better served by reversing the
roles of the two variables and noting, specifically, that males
constitute about two-thirds of those detained (2,354/3,489). It
should be noted that many popular computer programs for such
tabular analyses report three percentages for each cell in a table:

     • percentages based on column totals (as in our
       example);

     • percentages based on row totals;

     • percentages based on the table (corner) total.

Each of these figures serves a different analytic purpose, but 
can overwhelm the researcher who is not very clear about the 
purpose of the analysis. 
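
To make the three percentage bases concrete, the short Python sketch below recomputes them from the case counts in the Memphis-Shelby County table. It is a minimal illustration only; the function name `percentages` and its `base` argument are assumptions introduced here, not the interface of any particular statistical package.

```python
# Case counts from the Memphis-Shelby County table above.
counts = {
    "Not detained": {"Female": 978, "Male": 3238},
    "Detained": {"Female": 1135, "Male": 2354},
}


def percentages(table, base):
    """Percentage for every cell, computed on column totals ("column"),
    row totals ("row"), or the table (corner) total ("total")."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_total = {r: sum(table[r].values()) for r in rows}
    col_total = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_total.values())
    denominators = {
        "column": lambda r, c: col_total[c],
        "row": lambda r, c: row_total[r],
        "total": lambda r, c: grand_total,
    }
    denom = denominators[base]
    return {
        r: {c: round(100.0 * table[r][c] / denom(r, c), 1) for c in cols}
        for r in rows
    }


by_column = percentages(counts, "column")  # sex as the independent variable
by_row = percentages(counts, "row")        # composition of each outcome
by_total = percentages(counts, "total")    # share of all 7,705 cases
print(by_column["Detained"])
print(by_row["Detained"])
```

The column-based figures reproduce the 53.7% and 42.1% shown in the table, while the row-based figures give the composition of the detained group, which is the quantity of interest when planning detention capacity.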

The general question of correlation or association of 
variables may be put as follows: 

Does knowledge of the status of one variable 
affect the expectation of the status of a 
second variable? 

In the example of detention decision outcomes and sex, above, we
find that females are detained more often (relatively) than
males (53.7% versus 42.1%). In this instance, then, we can indeed
say that the two variables are associated. Had the two
percentages been essentially identical, on the other hand, we would
have concluded that there was no association or correlation
between sex and detention decisions. One important point should
be made at this juncture:



     This is not to say that sex causes differential
     detention prospects.

That "correlation is not causation" is a message emphasized
in most introductory statistics courses. Most measures
of association are symmetric in that if A is related to B,
then it is also the case that B is related to A. However, we
tend to think of causation as non-symmetric (if A causes B,
then B does not cause A) except in certain positive feedback
("recursive") situations in which "failure begets failure."
At the same time, we tend to think of causation as a sufficient
condition for correlation (so that if there is a causal link,
we expect a correlation as well).

Errors of measurement are significant in interpreting
correlations because if the errors of measurement in two
variables are uncorrelated, the correlation of the two variables
will be diminished. Of course, if the errors of measurement are
correlated, the observed correlation may be inflated (we say "may"
because correlations may be either positive or negative).
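
The size of the reduction can be sketched with the classical attenuation formula from test theory, in which the expected observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The formula is a standard result that the text does not itself name, and the reliability values used below are hypothetical.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation when the measurement errors in the
    two variables are uncorrelated (classical attenuation formula).

    Each reliability is the share of observed variance that is not
    measurement error, so it must lie in the interval (0, 1].
    """
    for reliability in (reliability_x, reliability_y):
        if not 0.0 < reliability <= 1.0:
            raise ValueError("reliability must be in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


# Hypothetical case: a true correlation of 0.50 measured with two
# instruments that are each 80 percent reliable.
observed = attenuated_correlation(0.50, 0.80, 0.80)
print(round(observed, 3))
```

Under these assumed reliabilities a true correlation of 0.50 would be expected to appear as roughly 0.40, which is worth remembering before a modest coefficient is read as evidence of a weak relationship.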

Spurious correlation is a term which reminds us again that
correlation is not causation. The term itself is a misnomer, for
it is not the correlation which is spurious but rather the
simplistic interpretation of the correlation.

The standard notion of a spurious correlation is that two
variables (say, X and Y) are correlated because they are both
the effects of some third, common variable (say, Z). More
generally, the effect of the third variable is to modify the
correlation which would "otherwise" occur between the two primary
variables.

For example, the author once correlated involvement in a
prison vocational training program with post-release success and
found the association to be negative. That is, participation in
the training program was predictive of failure post-release,
where failure was defined in terms of recidivism. This
correlation could be termed spurious if the "obvious" interpretation
were accepted, namely that participation in the program promoted
recidivism. Instead, a third variable, which was composed of
background factors found to predict failure (such as educational
attainment and previous occupational level), was introduced
into the analysis.

It was found that those who were low on this background
variable (less education, lower prior occupational experience)
were also more likely to participate in the in-prison vocational
training program. Taking this fact into account, the effect of
program participation appeared to be in the direction of success
rather than failure. That is, had the sample of observations
been divided into groups with similar predicted outcomes on the
basis of background, we would compare program participation with
post-release outcome within each group. In this case, we would say
that background had been "controlled" such that participation and
success are now positively correlated. In the actual analysis, a
technique called partial correlation was employed with the
background variable said to be "partialled."
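
A sign reversal of this kind can be reproduced with the standard first-order partial correlation formula. The sketch below is illustrative only: the three correlation values are invented to mimic the pattern described and are not the figures from the author's study.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical correlations (x = program participation, y = post-release
# success, z = background score). Values are invented for illustration.
r_participation_success = -0.10     # the "obvious" negative association
r_participation_background = -0.50  # low-background cases join more often
r_success_background = 0.60         # background predicts success

partial = partial_correlation(
    r_participation_success,
    r_participation_background,
    r_success_background,
)
print(round(partial, 3))
```

With these assumed values the zero-order correlation is negative while the partial correlation is positive, mirroring the reversal reported in the text once background is controlled.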

Ecological correlations have already been discussed and
in their most simple form they are correlations based on
"collective" or "areal" units. In point of fact, the term
"ecological" is applied in much the same manner as "spurious" in
that it says as much about the person using the term as it does
about the correlation which is being discussed. That is, the real
concern is with the ecological fallacy, which involves
attributing ecological level findings to individual units. For
example, if neighborhoods or even cities can be said to vary in
terms of their "tolerance" for various forms of deviance such that
some areas tend to be low on the several types of deviance, then we
would expect ecological correlations among the several types of
deviance to be positive.

The point is that the ecological correlation does not indi- 
cate that one form of deviance affects another form of deviance. 
To reach such a conclusion regarding individual behavior on the 
basis of an ecological correlation would be to commit the eco- 
logical fallacy. 

The Pearson product moment correlation coefficient is the
measure typically assumed when the word "correlation" is
used in its technical sense. The mathematical qualities of this
coefficient are those of a simple model which assumes various
things about the variables and their relationship, such as:

     • the relationship is linear;

     • the measurement scales of the variables are
       "equal interval;"

     • the errors of measurement in the two
       variables are uncorrelated.



Other measures of association are available for the situation in 
which the preceding assumptions do not seem reasonable. 
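
For readers who wish to see the coefficient computed, the following is a minimal pure-Python sketch of the usual product moment formula. The two small data sets are invented; the second is included to show how the linearity assumption matters, since a perfectly regular but curved relationship can yield a coefficient of zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a variable with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


# Invented data: an exactly linear relationship.
linear_r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Invented data: y depends entirely on x, but along a curve.
curved_r = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(linear_r, curved_r)
```

The curved example shows why a coefficient near zero should not be read as "no relationship" until the form of the relationship has been inspected.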



APPENDIX C

DISCUSSION: EXPERIMENTS

Experiments provide the classic basis on which to attribute
causal connections between "treatments" and "dependent variables."
The simplest experiment involves the random assignment of cases
to different treatment conditions (or levels of the independent
variable) in order to investigate the effect of differential
treatments upon change in one or more dependent variables. A
brief discussion should assist in understanding the power and
the limitations of this research tool.


The random assignment of cases to treatment conditions is
the unique attribute of experiments. Its rationale is indeed
intriguing: through the rule of "ignorance" we hope to overcome
any bias which might be introduced by "intentional" assignment
to treatment conditions. The important point is that a major
challenge to the validity of a study which attempts to assess
differences in outcome between two or more treatment conditions
is lack of evidence that the groups subjected to different
conditions were themselves identical to each other prior to the
experimental intervention. It is important, furthermore, to note
that random assignment does not assure that "all bias" is
removed nor that the groups are absolutely identical. What
differences may occur, however, are subject to known statistical
distributions and, therefore, may be taken into account.
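
A minimal sketch of random assignment follows, assuming the cases are identified by simple numbers and that equal-sized groups are wanted. The function name `randomly_assign` and the seed value are illustrative choices; the seed is there only so that an assignment can be reproduced and audited.

```python
import random


def randomly_assign(case_ids, conditions, seed=None):
    """Shuffle the cases, then deal them out to the conditions in rotation
    so that group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for position, case in enumerate(shuffled):
        condition = conditions[position % len(conditions)]
        groups[condition].append(case)
    return groups


# Twenty hypothetical cases split between two conditions.
groups = randomly_assign(range(1, 21), ["treatment", "control"], seed=1978)
print({name: sorted(members) for name, members in groups.items()})
```

Because chance alone decides membership, nothing about a case (or about the investigator's expectations for it) can systematically influence which condition it receives.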


It is sometimes argued that a "matched control group" is
a satisfactory alternative to random assignment. By
constructing a matching group, it is presumed that the investigator
knows all relevant variables and that they can be matched, a
very strong assumption; for, of course, to match groups with
respect to some variable, the variable must be amenable to
reasonably accurate measurement. On the other hand, utilizing
the ignorance of randomization does not require any knowledge
with respect to relevant variables. Thus, because of the crucial
difference, we suggest that a non-randomly assigned control group
be referred to as a "pseudo-control group" or a "comparison group"
and retain the term "control group" for the randomized case.



A second important challenge to the validity of a study
is the case of differential attrition of the several groups
defined in terms of treatment conditions. That is, even if
the investigator has been careful to randomize group
assignments, thereby blocking the potential for bias from
self-selection, removal from the experiment may not be totally under
the control of the investigator, thereby introducing the
potential for bias from self-selection out of the experiment. For
example, a jurisdiction's fiscal crisis may cause termination
of some program and a participant in a therapeutic community may
commit suicide.

Several approaches are available for use in instances when 
random assignment has not been possible. While attractive, they 
should not be thought to be the equal of random assignments. 

The "quasi-experiment" is a powerful tool where information
is available over time. If a series of measures over a period
of time is available, it can be used to establish a trend.
Predictions based on the pre-existing trend can then be compared
to post-intervention measures. In effect, this procedure is
based on a kind of "what if" thinking in that we form
expectations for the value of a dependent variable based on an
assumption of continuity over time if the intervention had not
occurred.
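
The "what if" logic can be sketched as a straight-line projection of the pre-intervention series. This is a deliberately simple illustration with invented figures; it assumes a linear trend and ignores seasonality, serial correlation and the other complications a real time-series analysis would have to address.

```python
def fit_trend(values):
    """Least-squares line through the points (0, v0), (1, v1), ...
    Returns (intercept, slope)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pre-intervention measures")
    mean_t = (n - 1) / 2.0
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    slope = sxy / sxx
    return mean_v - slope * mean_t, slope


def departures_from_trend(pre, post):
    """Observed post-intervention values minus the values projected from
    the pre-intervention trend (negative means below expectation)."""
    intercept, slope = fit_trend(pre)
    start = len(pre)
    return [
        observed - (intercept + slope * (start + k))
        for k, observed in enumerate(post)
    ]


# Hypothetical yearly counts of some outcome measure, invented here.
pre_intervention = [100, 102, 104, 106, 108]
post_intervention = [105, 106]
gaps = departures_from_trend(pre_intervention, post_intervention)
print(gaps)
```

In this invented series the outcome was rising by two units a year, so post-intervention values of 105 and 106 fall five and six units below what continuity alone would have led us to expect.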

Covariance adjustment is a statistical technique whereby
one seeks to separate the dependent variable into intervention
or treatment effects and "other" effects. This procedure
requires that variables be identified which are associated with
or predictive of the dependent variable. Any differences prior
to intervention of the groups in terms of these variables are
then taken into account in interpreting post-intervention
differences in the groups. While this approach seems elegantly
simple, there are many questions which remain and they go
beyond the scope of this brief review.
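
One common form of covariance adjustment can be sketched as follows: each group's outcome mean is shifted to what it would be at the overall covariate mean, using a slope pooled within groups. This is only one of several possible procedures, it assumes a single covariate with the same linear slope in every group, and the data are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def adjusted_means(groups):
    """groups maps a group name to a list of (covariate, outcome) pairs.
    Returns each group's outcome mean adjusted to the grand covariate
    mean, using the pooled within-group regression slope."""
    sxx = 0.0
    sxy = 0.0
    all_covariates = []
    group_means = {}
    for name, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mean_x, mean_y = mean(xs), mean(ys)
        group_means[name] = (mean_x, mean_y)
        sxx += sum((x - mean_x) ** 2 for x in xs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        all_covariates.extend(xs)
    if sxx == 0.0:
        raise ValueError("covariate does not vary within groups")
    slope = sxy / sxx
    grand_x = mean(all_covariates)
    return {
        name: mean_y - slope * (mean_x - grand_x)
        for name, (mean_x, mean_y) in group_means.items()
    }


# Invented (background score, outcome score) pairs. The program group
# starts with lower background scores than the comparison group.
data = {
    "program": [(1, 3), (2, 5), (3, 7)],
    "comparison": [(3, 6), (4, 8), (5, 10)],
}
adjusted = adjusted_means(data)
print(adjusted)
```

In these invented data the raw outcome means favor the comparison group (8 versus 5), but after adjusting for the difference in background the program group comes out ahead (7 versus 6).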

Subject matching is an oft-used technique within the
non-experimental domain. While worthwhile in partially reducing
pre-intervention differences, it cannot be considered an adequate
approach alone. Rather, matching can be viewed as complementary
to covariance adjustment. One potential interpretational
pitfall of matching is that consumers of the research report may
be insensitive to the crucial distinction between post hoc
matching and random assignment. Thus, it is important to warn
the evaluation consumer that the evidence of a matched study is
not as strong as that of an experiment.

A final note is appropriate regarding the relative power
of experimental and non-experimental techniques. While the pure
experiment can yield very "clean" results, various constraints
on the use of experiments at large levels within society may
cause non-experiments to be superior in specific areas, due to
the very large range of diversity of environments in which
results can be evaluated.



